fix: normalize agent IDs and remove bootstrap files for benchmark #37

Merged
olearycrew merged 4 commits into pinchbench:main from zhuanghaoz:fix/agent-id-normalization on Mar 19, 2026

Conversation

@zhuanghaoz
Contributor

  • Fix agent ID normalization to handle lowercase transformation
  • Remove BOOTSTRAP.md, SOUL.md, USER.md, IDENTITY.md before running tasks
  • Fix model ID normalization to preserve provider-qualified models (e.g., minimax-cn/)

These fixes ensure benchmark tasks work correctly with OpenClaw agents.
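
The two normalization rules above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function names, the `default_provider` parameter, and the `example-provider` value are all hypothetical.

```python
def normalize_agent_id(agent_id: str) -> str:
    """Trim and lowercase an agent ID so lookups are case-insensitive."""
    return agent_id.strip().lower()


def normalize_model_id(model_id: str, default_provider: str = "example-provider") -> str:
    """Keep an existing provider prefix; otherwise add a default one.

    `default_provider` is a placeholder for illustration only.
    """
    model_id = model_id.strip()
    if "/" in model_id:
        # Already provider-qualified (for example "minimax-cn/..."): leave it untouched.
        return model_id
    return f"{default_provider}/{model_id}"
```

The point of the second function is that a model ID which already carries a provider prefix passes through unchanged rather than being re-prefixed.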

@kilo-code-bot
Contributor

kilo-code-bot bot commented Mar 9, 2026

Code Review Summary

Status: 2 Issues Found | Recommendation: Address before merge

Overview

| Severity | Count |
|---|---|
| CRITICAL | 0 |
| WARNING | 2 |
| SUGGESTION | 0 |
Issue Details

WARNING

| File | Line | Issue |
|---|---|---|
| scripts/lib_agent.py | 232 | Redundant import shutil inside loop; already imported at line 180 in the same function |
| scripts/lib_agent.py | 235 | shutil.copytree follows symlinks by default, potentially copying sensitive data into benchmark workspace |
Files Reviewed (1 file)
  • scripts/lib_agent.py - 2 issues


Reviewed by claude-4.6-opus-20260205 · 244,628 tokens

@zhuanghaoz force-pushed the fix/agent-id-normalization branch from 0c5418b to 0cccb85 on March 9, 2026 14:46
- Copy skills from main workspace to benchmark workspace so agents can use nano-pdf
- Add 2-second delay before grading to ensure files are flushed to disk
- Fix model ID normalization to preserve provider-qualified models
@olearycrew
Member

@zhuanghaoz thanks for this contribution

I am wondering whether "Remove BOOTSTRAP.md, SOUL.md, USER.md, IDENTITY.md before running tasks" is a good idea. Early in this project I had problems with OpenClaw not liking it when those files were missing and then getting lost during actual task completion.
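
For reference, the removal step under discussion could be sketched as below, with an opt-out so the files can be kept if the agent misbehaves without them. This is an illustrative sketch only: the function name and the `keep` flag are hypothetical and are not taken from this PR.

```python
from pathlib import Path

# File names taken from the PR description.
BOOTSTRAP_FILES = ("BOOTSTRAP.md", "SOUL.md", "USER.md", "IDENTITY.md")


def remove_bootstrap_files(workspace: Path, keep: bool = False) -> list[str]:
    """Delete bootstrap files from `workspace`; return the names removed.

    `keep=True` (a hypothetical switch) leaves the workspace untouched.
    """
    removed: list[str] = []
    if keep:
        return removed
    for name in BOOTSTRAP_FILES:
        path = workspace / name
        if path.is_file():
            path.unlink()
            removed.append(name)
    return removed
```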

@ScuttleBot

👋 Hi @zhuanghaoz! I'm @olearycrew's OpenClaw bot doing a triage pass.

This PR has merge conflicts with main that need to be resolved before it can be merged.

Just flagging in case you missed it!

zhuanghaoz and others added 2 commits March 13, 2026 20:02
Co-authored-by: kilo-code-bot[bot] <240665456+kilo-code-bot[bot]@users.noreply.github.com>
@olearycrew olearycrew merged commit 05727c1 into pinchbench:main Mar 19, 2026
if skill_dir_src.is_dir():
    dest_skill_dir = dest_skills_dir / skill_dir_src.name
    # Copy skill directory
    import shutil
WARNING: Redundant import shutil inside a loop — shutil is already imported at line 180 at the top of this same function. While Python caches module imports so this won't cause a runtime error, it's unnecessary and clutters the loop body. Move it to the top of the function alongside the existing import, or simply remove this line.

    import shutil
    if dest_skill_dir.exists():
        shutil.rmtree(dest_skill_dir)
    shutil.copytree(skill_dir_src, dest_skill_dir)
WARNING: shutil.copytree follows symlinks by default. If any skill directory under ~/.openclaw/workspace/skills/ contains symlinks pointing to sensitive locations (e.g., ~/.ssh, credentials), those files will be fully copied into the benchmark workspace and become accessible to the benchmark agent.

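Consider using shutil.copytree(skill_dir_src, dest_skill_dir, symlinks=True) so that symlinks are recreated as links instead of having their targets copied. Note that ignore_dangling_symlinks=True is not a substitute: it only silences errors for broken links when symlinks is false and does not stop link targets from being copied. This hardens the copy against unintended data exposure in the benchmark workspace.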
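
A stricter option than preserving links is to skip them entirely, since a preserved link can still resolve to its original target from inside the benchmark workspace. The sketch below shows one way to do that with the `ignore` callback of `shutil.copytree`, with the import hoisted out of the loop as the first warning suggests. The function names `copy_skills` and `_skip_symlinks` are hypothetical; the variable names follow the diff.

```python
import os
import shutil
from pathlib import Path


def _skip_symlinks(directory: str, names: list[str]) -> list[str]:
    """copytree `ignore` callback: report every symlink entry as ignored."""
    return [n for n in names if os.path.islink(os.path.join(directory, n))]


def copy_skills(src_skills_dir: Path, dest_skills_dir: Path) -> None:
    """Copy each skill directory, leaving out symlinks at every level."""
    dest_skills_dir.mkdir(parents=True, exist_ok=True)
    for skill_dir_src in src_skills_dir.iterdir():
        # Skip top-level symlinks and anything that is not a directory.
        if skill_dir_src.is_symlink() or not skill_dir_src.is_dir():
            continue
        dest_skill_dir = dest_skills_dir / skill_dir_src.name
        if dest_skill_dir.exists():
            shutil.rmtree(dest_skill_dir)
        shutil.copytree(skill_dir_src, dest_skill_dir, ignore=_skip_symlinks)
```

The trade-off is that a skill which legitimately relies on a symlinked file would arrive incomplete, so this suits cases where isolation matters more than fidelity.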


3 participants